2,097 research outputs found

    Personalized Pancreatic Tumor Growth Prediction via Group Learning

    Full text link
    Tumor growth prediction, a highly challenging task, has long been viewed as a mathematical modeling problem, where the tumor growth pattern is personalized based on imaging and clinical data of a target patient. Though mathematical models yield promising results, their prediction accuracy may be limited by the absence of population trend data and personalized clinical characteristics. In this paper, we propose a statistical group learning approach to predict the tumor growth pattern that incorporates both the population trend and personalized data, in order to discover high-level features from multimodal imaging data. A deep convolutional neural network approach is developed to model the voxel-wise spatio-temporal tumor progression. The deep features are combined with the time intervals and the clinical factors to feed a process of feature selection. Our predictive model is pretrained on a group data set and personalized on the target patient data to estimate the future spatio-temporal progression of the patient's tumor. Multimodal imaging data at multiple time points are used in the learning, personalization and inference stages. Our method achieves a Dice coefficient of 86.8% +- 3.6% and RVD of 7.9% +- 5.4% on a pancreatic tumor data set, outperforming the DSC of 84.4% +- 4.0% and RVD 13.9% +- 9.8% obtained by a previous state-of-the-art model-based method

    Heuristic Search over a Ranking for Feature Selection

    Get PDF
    In this work, we suggest a new feature selection technique that lets us use the wrapper approach for finding a well suited feature set for distinguishing experiment classes in high dimensional data sets. Our method is based on the relevance and redundancy idea, in the sense that a ranked-feature is chosen if additional information is gained by adding it. This heuristic leads to considerably better accuracy results, in comparison to the full set, and other representative feature selection algorithms in twelve well–known data sets, coupled with notable dimensionality reduction

    Digging into acceptor splice site prediction : an iterative feature selection approach

    Get PDF
    Feature selection techniques are often used to reduce data dimensionality, increase classification performance, and gain insight into the processes that generated the data. In this paper, we describe an iterative procedure of feature selection and feature construction steps, improving the classification of acceptor splice sites, an important subtask of gene prediction. We show that acceptor prediction can benefit from feature selection, and describe how feature selection techniques can be used to gain new insights in the classification of acceptor sites. This is illustrated by the identification of a new, biologically motivated feature: the AG-scanning feature. The results described in this paper contribute both to the domain of gene prediction, and to research in feature selection techniques, describing a new wrapper based feature weighting method that aids in knowledge discovery when dealing with complex datasets

    A Methodology for the Diagnostic of Aircraft Engine Based on Indicators Aggregation

    Full text link
    Aircraft engine manufacturers collect large amount of engine related data during flights. These data are used to detect anomalies in the engines in order to help companies optimize their maintenance costs. This article introduces and studies a generic methodology that allows one to build automatic early signs of anomaly detection in a way that is understandable by human operators who make the final maintenance decision. The main idea of the method is to generate a very large number of binary indicators based on parametric anomaly scores designed by experts, complemented by simple aggregations of those scores. The best indicators are selected via a classical forward scheme, leading to a much reduced number of indicators that are tuned to a data set. We illustrate the interest of the method on simulated data which contain realistic early signs of anomalies.Comment: Proceedings of the 14th Industrial Conference, ICDM 2014, St. Petersburg : Russian Federation (2014

    Is This a Joke? Detecting Humor in Spanish Tweets

    Full text link
    While humor has been historically studied from a psychological, cognitive and linguistic standpoint, its study from a computational perspective is an area yet to be explored in Computational Linguistics. There exist some previous works, but a characterization of humor that allows its automatic recognition and generation is far from being specified. In this work we build a crowdsourced corpus of labeled tweets, annotated according to its humor value, letting the annotators subjectively decide which are humorous. A humor classifier for Spanish tweets is assembled based on supervised learning, reaching a precision of 84% and a recall of 69%.Comment: Preprint version, without referra

    Predicting sentence translation quality using extrinsic and language independent features

    Get PDF
    We develop a top performing model for automatic, accurate, and language independent prediction of sentence-level statistical machine translation (SMT) quality with or without looking at the translation outputs. We derive various feature functions measuring the closeness of a given test sentence to the training data and the difficulty of translating the sentence. We describe \texttt{mono} feature functions that are based on statistics of only one side of the parallel training corpora and \texttt{duo} feature functions that incorporate statistics involving both source and target sides of the training data. Overall, we describe novel, language independent, and SMT system extrinsic features for predicting the SMT performance, which also rank high during feature ranking evaluations. We experiment with different learning settings, with or without looking at the translations, which help differentiate the contribution of different feature sets. We apply partial least squares and feature subset selection, both of which improve the results and we present ranking of the top features selected for each learning setting, providing an exhaustive analysis of the extrinsic features used. We show that by just looking at the test source sentences and not using the translation outputs at all, we can achieve better performance than a baseline system using SMT model dependent features that generated the translations. Furthermore, our prediction system is able to achieve the 22nd best performance overall according to the official results of the Quality Estimation Task (QET) challenge when also looking at the translation outputs. Our representation and features achieve the top performance in QET among the models using the SVR learning model

    Orientational instabilities in nematics with weak anchoring under combined action of steady flow and external fields

    Full text link
    We study the homogeneous and the spatially periodic instabilities in a nematic liquid crystal layer subjected to steady plane {\em Couette} or {\em Poiseuille} flow. The initial director orientation is perpendicular to the flow plane. Weak anchoring at the confining plates and the influence of the external {\em electric} and/or {\em magnetic} field are taken into account. Approximate expressions for the critical shear rate are presented and compared with semi-analytical solutions in case of Couette flow and numerical solutions of the full set of nematodynamic equations for Poiseuille flow. In particular the dependence of the type of instability and the threshold on the azimuthal and the polar anchoring strength and external fields is analysed.Comment: 12 pages, 6 figure

    A transfer-learning approach to feature extraction from cancer transcriptomes with deep autoencoders

    Get PDF
    Publicado en Lecture Notes in Computer Science.The diagnosis and prognosis of cancer are among the more challenging tasks that oncology medicine deals with. With the main aim of fitting the more appropriate treatments, current personalized medicine focuses on using data from heterogeneous sources to estimate the evolu- tion of a given disease for the particular case of a certain patient. In recent years, next-generation sequencing data have boosted cancer prediction by supplying gene-expression information that has allowed diverse machine learning algorithms to supply valuable solutions to the problem of cancer subtype classification, which has surely contributed to better estimation of patient’s response to diverse treatments. However, the efficacy of these models is seriously affected by the existing imbalance between the high dimensionality of the gene expression feature sets and the number of sam- ples available for a particular cancer type. To counteract what is known as the curse of dimensionality, feature selection and extraction methods have been traditionally applied to reduce the number of input variables present in gene expression datasets. Although these techniques work by scaling down the input feature space, the prediction performance of tradi- tional machine learning pipelines using these feature reduction strategies remains moderate. In this work, we propose the use of the Pan-Cancer dataset to pre-train deep autoencoder architectures on a subset com- posed of thousands of gene expression samples of very diverse tumor types. The resulting architectures are subsequently fine-tuned on a col- lection of specific breast cancer samples. This transfer-learning approach aims at combining supervised and unsupervised deep learning models with traditional machine learning classification algorithms to tackle the problem of breast tumor intrinsic-subtype classification.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
    corecore